Abstract


The basis of this analysis is a model presented at the ACM International Conference on AI in Finance in
New York in October 2020 (Yang et al., 2020). The authors claim that the
introduced deep reinforcement learning ensemble model outperforms the Dow Jones
Industrial Average Index and the three individual algorithms that form the ensemble in
terms of risk-adjusted returns as measured by the Sharpe ratio. Furthermore, it is claimed


1. Introduction


Fund management may be divided into active and passive management. Passive fund management tries to replicate
the performance of an index and has become very popular since the Global Financial Crisis in the form of Exchange
Traded Funds (Kenechukwu et al., 2020).


Active fund management aims to outperform the index via discretionary or automated strategies. With ever
increasing computational power, advances in the field of Artificial Intelligence¹, and the extraordinary success of
pioneers like Jim Simons (Zuckerman, 2019), active, automated, quantitative approaches in the form of computer
programs, commonly known as “algos”, now make up the vast majority of trading volume on exchanges in the US².


* Corresponding author: Rainer Jager, Student of Computing & Mathematical Sciences, University of Waikato, Hamilton 3216,
New Zealand. E-mail: rj63@students.waikato.ac.nz


1 “Recently, self-learning systems have achieved remarkable success in several challenging problems for artificial intelligence, by
combining reinforcement learning with deep neural networks. In this talk, I describe the ideas and algorithms that led to AlphaGo:
the first program to defeat a human champion in the game of Go; AlphaZero: which learned, from scratch, to also defeat the world
computer champions in chess and shogi; and AlphaStar: the first program to defeat a human champion in the real-time strategy game
of StarCraft.” (Silver, 2021)


2 “In the US stock market and many other developed financial markets, about 70-80% of overall trading volume is generated through
algorithmic trading.” (January 22, 2021)


2710-2599/© 2021. Rainer Jager. This is an open access article distributed under the Creative Commons Attribution License, which permits 
unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 


The model which forms the basis of this analysis is one example of such a computer program (Yang et al., 2020).


The remainder of this paper first introduces related work. It then gives a bird’s-eye view of how the algorithm works.
In section 4 we look at the results of the deep learning ensemble, discuss shortfalls of the model, and suggest a
workaround. The summary of our findings is the basis of a new model. We then compare the performance of the new model
to the original model (Yang et al., 2020). The paper concludes with ideas for future work. Appendix A contains the details of the
statistical tests and Appendix B contains experiments not otherwise mentioned.


2. Related Work 


The rise of the use of algos in finance as seen in the share of trading volume has mainly been driven by advances in 
the underlying computer programs. The development from machine learning to the use of artificial intelligence in the 
form of neural networks in general, and deep reinforcement learning algorithms in particular has its origins in the 
success of these algos in other domains, specifically in gaming (Silver, 2021). 


Deep reinforcement learning combines the idea of object formulation and object optimization in the form of
reinforcement learning and deep learning (Silver, 2021). These algorithms are typically categorized into one
of the following three types: (a) actor-only; (b) critic-only; and (c) actor-critic (Fischer, 2018).


The idea of the actor-only type is to learn from the observation state directly. The critic-only algorithms on the 
other hand are choosing their actions based on the value-network’s prediction. Finally, the actor-critic algorithm’s 
idea is to exploit the advantages of both types. It employs two agents: an actor deciding actions based on the state 
of the environment and a critic computing the rewards of those actions. The idea is that the actor’s network is 
gradually adjusted to maximize the rewards predicted by the critic (Fischer, 2018). 


The deep reinforcement learning ensemble model at the heart of this study is a pure actor-critic ensemble. All
three of its algos, Advantage Actor Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy
Optimization (PPO), fall into this category (Yang et al., 2020).


3. Algorithm 


Firstly, the stock market data of the Dow Jones Industrial Index for its 30 members from 2009 to mid-2020 is preprocessed.
This process results in consistent end-of-day prices (that is, adjusted for stock splits and dividends), price-based
technical indicators³, and a turbulence⁴ index (Figure 1).


The individual agents are then trained from January 2009 to October 2015 on the preprocessed data. The
agent that achieved the highest risk-adjusted return, commonly known as the Sharpe ratio⁵, in the three months prior to
the start of the trading period (January 2016) is picked by the ensemble.


This three month selection process is called the validation period. The selected algorithm then exclusively trades 
for the next three months. 
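

The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the 252-day annualization, the use of the arithmetic mean of daily returns as a stand-in for the annual return, and the sample returns are all assumptions made for this example.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumption: common annualization convention


def sharpe_ratio(daily_returns):
    """Annualized mean daily return over annualized std of daily returns (risk-free rate 0)."""
    mean = statistics.fmean(daily_returns)
    std = statistics.pstdev(daily_returns)
    if std == 0:
        return 0.0  # no variation in returns: treat the ratio as undefined/zero
    return (mean * TRADING_DAYS_PER_YEAR) / (std * math.sqrt(TRADING_DAYS_PER_YEAR))


def pick_agent(validation_returns):
    """Return the name of the agent with the highest Sharpe ratio in the validation period."""
    return max(validation_returns, key=lambda name: sharpe_ratio(validation_returns[name]))


# Hypothetical validation-period daily returns for the three agents.
validation_returns = {
    "A2C": [0.001, -0.002, 0.003, 0.000],
    "PPO": [0.002, 0.001, 0.002, 0.001],
    "DDPG": [-0.001, 0.004, -0.003, 0.002],
}
best = pick_agent(validation_returns)  # this agent alone trades in the next period
```

In this toy data PPO has the steadiest returns and is therefore selected, even though its single best day is smaller than DDPG's.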


[Figure 1: timeline with the markers “start training” (01/01/2009), “start validation” (10/01/2015), and “start trading” (01/01/2016), ending on 05/12/2020; the periods are labelled in-sample and out-of-sample.]


Figure 1: Stock Data Splitting. Adjusted from Yang et al. (2020)


3 Technical indicators try to take advantage of market inefficiencies and are used in short-term trading. They are either trend- or
momentum-based. The model introduces four such indicators to the observation space (Yang et al., 2020).


4 $\text{turbulence}_t = (y_t - \mu)\,\Sigma^{-1}\,(y_t - \mu)' \in \mathbb{R}$, where $y_t \in \mathbb{R}^D$ denotes the stock returns for the current period $t$, $\mu \in \mathbb{R}^D$ denotes the average of
historical returns, and $\Sigma \in \mathbb{R}^{D \times D}$ denotes the covariance of historical returns (Yang et al., 2020).


5 We define the Sharpe ratio as: (annual return) / (annualized standard deviation of daily returns)


Rainer Jager / IntJ.DataSci. & Big Data Anal. 1(3) (2021) 27-51 Page 29 of 51 


Prior to every day of trading, the current level of the turbulence index is compared with a constant threshold, and
all positions are closed out if the index exceeds that threshold. Trading resumes when the turbulence index falls
below the threshold.
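

As a rough illustration of this rule, the sketch below computes the turbulence value for one day as the quadratic form (y_t - mu) Sigma^{-1} (y_t - mu)' and compares it with the threshold. It uses only the Python standard library; the helper names, the sample covariance with divisor n - 1, and the toy return history are assumptions for this example rather than details taken from the original code.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def turbulence(current_returns, historical_returns):
    """Quadratic form of today's return deviation under the historical covariance."""
    n_days = len(historical_returns)
    dim = len(current_returns)
    mu = [sum(day[i] for day in historical_returns) / n_days for i in range(dim)]
    cov = [[sum((day[i] - mu[i]) * (day[j] - mu[j]) for day in historical_returns) / (n_days - 1)
            for j in range(dim)] for i in range(dim)]
    diff = [current_returns[i] - mu[i] for i in range(dim)]
    # diff @ inv(cov) @ diff, computed without forming the inverse explicitly
    return sum(d * s for d, s in zip(diff, solve_linear(cov, diff)))


def may_trade(turbulence_value, threshold=140.0):
    """Trading is allowed unless the index exceeds the constant threshold."""
    return turbulence_value <= threshold


# Toy history for two assets (hypothetical numbers).
history = [[0.01, 0.0], [-0.01, 0.0], [0.0, 0.01], [0.0, -0.01]]
today = turbulence([0.01, 0.0], history)
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and avoids a separate matrix-inverse routine.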


This training-validation-trading cycle gets extended by three months and repeated until the end. The stock market 
is modeled as a Markov Decision Process in a standard reinforcement learning environment (Figure 2). 
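

The rolling cycle can be made concrete with a small date generator. The sketch below assumes that training always starts in January 2009 and grows with each step, that validation and trading each last three months, and that the last trading window is cut off at the end date; the function names are hypothetical.

```python
from datetime import date


def add_months(day, months):
    """Shift a first-of-month date by a number of months."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def rolling_windows(train_start, first_validation_start, end, step_months=3):
    """Yield (train_start, validation_start, trade_start, trade_end) for each cycle."""
    validation_start = first_validation_start
    while True:
        trade_start = add_months(validation_start, step_months)
        if trade_start >= end:
            break
        trade_end = min(add_months(trade_start, step_months), end)
        yield train_start, validation_start, trade_start, trade_end
        # the next cycle validates on the period that was just traded
        validation_start = trade_start


windows = list(rolling_windows(date(2009, 1, 1), date(2015, 10, 1), date(2020, 5, 12)))
```

With the dates from Figure 1 this yields 18 cycles: the first trades from January to April 2016, and the last one is truncated at May 12, 2020.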


[Figure 2 shows the agent-environment loop. Surviving labels: Trading Agents (Ensemble Strategy, PPO); Environment (Technical Indicators); Action: sell/hold/buy; Reward: Profit/Loss; State: observations.]


Figure 2: Modeling of a Stock-Market Environment from Yang et al. (2020)


Stock prices, a money balance, the current portfolio, and four technical indicators are part of the observation 
space. Buy, sell, or holding of securities are part of the action space. Finally, a reward function to maximize the money 
balance is formulated. 
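

A compact way to picture this state, action, and reward design is the following sketch. It is an illustration under stated assumptions, not the environment used by Yang et al. (2020): the class and function names are hypothetical, trades are executed at current prices without transaction costs, and the reward is simply the change in total portfolio value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PortfolioState:
    balance: float                 # cash available
    prices: List[float]            # one price per stock
    holdings: List[int]            # shares held per stock
    indicators: List[List[float]]  # four technical indicators, one value per stock each

    def observation(self) -> List[float]:
        """Flatten the state into the vector an agent would observe."""
        flat = [self.balance, *self.prices, *(float(h) for h in self.holdings)]
        for values in self.indicators:
            flat.extend(values)
        return flat

    def total_value(self) -> float:
        return self.balance + sum(p * h for p, h in zip(self.prices, self.holdings))


def apply_actions(state: PortfolioState, actions: List[int]) -> PortfolioState:
    """Positive entries buy, negative entries sell, zero holds (no transaction costs)."""
    balance = state.balance
    holdings = list(state.holdings)
    for i, shares in enumerate(actions):
        shares = max(shares, -holdings[i])  # cannot sell more than is held
        if shares > 0:
            shares = min(shares, int(balance // state.prices[i]))  # cannot overspend
        balance -= shares * state.prices[i]
        holdings[i] += shares
    return PortfolioState(balance, state.prices, holdings, state.indicators)


def reward(before: PortfolioState, after: PortfolioState) -> float:
    """Profit or loss between two consecutive states."""
    return after.total_value() - before.total_value()


state = PortfolioState(1000.0, [10.0, 20.0], [0, 5], [[0.0, 0.0] for _ in range(4)])
traded = apply_actions(state, [10, -10])
next_day = PortfolioState(traded.balance, [11.0, 20.0], traded.holdings, traded.indicators)
pnl = reward(traded, next_day)
```

Under this layout, 30 stocks would give an observation vector of 1 + 30 × 6 = 181 entries (balance, then prices, holdings, and four indicators per stock).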


4. Experiments 


This report focuses on an analysis of the performance of the deep reinforcement learning ensemble model introduced
in the paper and on suggestions for improving it. The results of section 6.2 of the paper are based on only
seven runs⁶ (Yang et al., 2020). This is not sufficient to draw statistically supported conclusions (Bortz, 1993). Based
on 30 runs each, the trading performance of the ensemble, its individual agents, and the Dow Jones Index is shown
in Table 1.


Table 1: Performance Overview 


start date 2016-01-04 
end date 2020-05-12 


3m/3m performance (turb lvl == 140) original DJIA30 | 
average annual return in 30 runs 10.34% | 9. | 7.78% | | 
max drawdown in 30 runs -11.06% -37.09% | | 
average Calmar (=average return/max drawdown) ratio over 30 runs 1.44

median Sharpe ratio over 30 runs 1.27 


6 See backtesting.ipynb, January 24, 2021, on webpage: https://github.com/AI4Finance-LLC/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020


The benchmark, DJIA30, is greyed out as it is the result of a single backtest on the data, whereas the ensemble and
its individual agents were each run 30 times because their learning processes are stochastic.


We can now test the paper’s claim of risk-adjusted outperformance of the ensemble compared to a buy and hold 
strategy of the benchmark index for statistical significance. 


A test⁷ for normality of the ensemble’s Sharpe ratios produces a high p-value, so the hypothesis
of normally distributed Sharpe ratios cannot be rejected. The z-value of the buy & hold Sharpe ratio lies outside a
95% confidence interval. Consequently, we agree with the paper’s claim of the ensemble’s risk-adjusted outperformance
based on statistical significance.
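

The z-value check can be sketched as follows. Appendix A.1.1 calls a helper `z_value(mu, observation, se)` whose definition is not shown, so the formula below is an assumption; it is at least consistent with the two z-values reported in the appendix (22.45 and 9.92), whose ratio matches the ratio of the two distances from the mean. The sample values are illustrative and are not the Sharpe ratios from the 30 runs.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean with ddof=0, as in stats.sem(x, ddof=0)."""
    return statistics.pstdev(samples) / math.sqrt(len(samples))


def z_value(mu, observation, se):
    """Distance of an observation from the sample mean, measured in standard errors."""
    return round((mu - observation) / se, 2)


# Illustrative numbers only (hypothetical).
sharpe_ratios = [1.0, 2.0, 3.0, 4.0]
mu = statistics.fmean(sharpe_ratios)
z = z_value(mu, 0.0, standard_error(sharpe_ratios))

# A two-sided 95% interval for a standard normal variable is roughly +/- 1.96.
outside_95_percent = abs(z) > 1.96
```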


4.1. Analysis of Potential Shortfalls of the Ensemble Model 
4.1.1. Validation/Trading Period 


The ensemble model uses a three-month validation period, at the end of which the agent with the highest
Sharpe ratio during that period is picked to exclusively trade for the next three months (Yang et al., 2020). This logic
is rolled forward until the end of the trading period. Additionally, it uses a constant turbulence index threshold level
of 140 to avoid trading during unusually volatile periods.


Is it possible to achieve better results with a different combination of validation/trading periods? Simulation of 30 
runs each produced the following results in Table 2. 


Table 2: Performance Overview for Different Validation/Trading Period Combinations


start date | 2016-01-04
end date   | 2020-05-12

ensemble results (validation/trading) | 2m/2m   | 3m/3m   | 4m/4m   | 5m/5m   | 6m/6m
average annual return in 30 runs      | 6.53%   | 10.34%  | 10.95%  | 9.34%   | 7.05%
max drawdown in 30 runs               | -22.03% | -11.06% | -11.35% | -12.63% | -17.95%
average Calmar ratio over 30 runs     | 0.38    | 1.44    | 1.22    | 1.05    | 0.60
median Sharpe ratio                   | 0.62    | 1.27    | 1.26    | 1.04    | 0.70


It is evident that the 4m/4m combination of validation/trading period produced higher returns. However, the 3m/3m
combination produced the highest risk-adjusted returns. We therefore agree that the 3m/3m validation/trading period
combination chosen by the authors (Yang et al., 2020) produces the best risk-adjusted returns.
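

For readers who want to reproduce the drawdown-based figures in the tables, the following sketch shows one common way to compute the maximum drawdown from daily returns and the Calmar ratio as defined in Table 1 (return divided by max drawdown). The function names and the sample returns are assumptions for illustration only.

```python
def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the cumulative wealth curve, as a negative fraction."""
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in daily_returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


def calmar_ratio(annual_return, drawdown):
    """Annual return divided by the magnitude of the maximum drawdown."""
    return annual_return / abs(drawdown)


# Hypothetical returns: +10%, then -20%, then +5%.
drawdown = max_drawdown([0.10, -0.20, 0.05])
ratio = calmar_ratio(0.10, drawdown)
```

Note that the tables report the average of the per-run Calmar ratios, which is not the same as dividing the average return by the worst drawdown over all runs.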


4.1.2. Validation Rule 


Intuitively, it looks like a good idea to pick a trader that is currently in form to do the trading for the immediate future. 
This is essentially what the ensemble is doing by using a three months validation period and deciding the agent that 
is going to be trading for the next three months based on the agents’ risk-adjusted performance during the validation 
period. 


However, if we simulate 30 runs of an ensemble with the same validation/trading combination and a turbulence 
index of 140 that picks its agent to do the trading randomly, the results are as follows: 


The results produced by 30 simulations of the random-choice ensemble (Table 3) are better on all four performance criteria;
notably, it also produced better risk-adjusted returns.


As the difference between the two models is very small, we conclude that a ‘random-choice’ ensemble
produces risk-adjusted returns as good as those of the original ensemble.⁸


4.1.3. Turbulence Index 


Firstly, we want to find the best threshold level possible: Introducing various threshold levels for the buy and hold 
strategy produces the following results for the period between January 4, 2016 to May 12, 2020: 


7 See detailed calculations under Appendix A.1 and Appendix A.1.1


8 See detailed calculations under Appendix A.2 and Appendix A.2.1


Table 3: Performance of Original-Ensemble Versus a Random-Choice Ensemble


start date | 2016-01-04
end date   | 2020-05-12

3m/3m performance (turb lvl == 140) | original-ensemble | random-choice
average annual return in 30 runs    | 10.34%            | 10.42%
max drawdown in 30 runs             | -11.06%           | -10.19%
average Calmar ratio over 30 runs   | 1.44              | 1.47
median Sharpe ratio over 30 runs    | 1.27              | 1.29


Table 4: Dow Jones Industrial Performance with Different Turbulence Index Levels 


turbulence index threshold 
annual return J 12.82% | 12.82% | 10.30% 


cumulative returns , 97% | 69.05% | 69.05% | 53.03% 
annual volatility % | 11.23% | 11.23% | 13.96% 


Sharpe ratio (r=0) 1.14 1.14 0.47 
max drawdown 14.62% | -14.62% | 20.57% | -21.71% 24.94% 
RoMaD ("Calmar’) 5 0.88 | 0.88 ; 031 


A t-test between the Sharpe ratios produced with turbulence index threshold levels of 120 and 140 for each quarter
during the period 2010 to 2016 was not able to reject the hypothesis that both produce equally good risk-adjusted
returns.⁹ We conclude that the threshold of 140 for the turbulence index as used in the paper (Yang et al., 2020) is the
right level if a turbulence index is used.


It becomes clear that the introduction of the turbulence index for the benchmark vastly improves its risk-adjusted
performance from a Sharpe ratio of 0.39 as per Tables 1 to 4.


Even though the performance of the benchmark improves with the help of the turbulence index, the ensemble
model still produces significantly better risk-adjusted returns.¹⁰


However, the introduction of a turbulence index threshold is problematic as it requires setting a constant
in a dynamic environment.


4.1.3.1. An Alternative to the Turbulence Index


One way to tweak the ensemble, circumvent the problem of setting a constant in a dynamic environment, and avoid the
critique that its performance has to be compared to an equivalent benchmark is to replace the turbulence index with a
new, observable variable like the VIX¹¹ (Table 5).


This gives the ensemble the opportunity to learn to adjust its positions when there is turbulence ahead. The assessment
of turbulence is then based on its own judgement.


9 See detailed calculations under Appendix A.3 and Appendix A.3.1


10 See detailed calculations under Appendix A.3.2


11 VIX data is freely available for download. “The VIX Index is a financial benchmark designed to be an up-to-the-minute market
estimate of expected volatility of the S&P 500 Index, and is calculated by using the midpoint of real-time S&P 500® Index (SPX)
option bid/ask quotes. More specifically, the VIX Index is intended to provide an instantaneous measure of how much the market
thinks the S&P 500 Index will fluctuate in the 30 days from the time of each tick of the VIX Index.”, 26th January 2021 from
webpage: https://www.cboe.com/tradable_products/vix/faqs




Table 5: Performance of the Original Ensemble Versus an Ensemble Without a Turbulence Index but VIX Data 


start date 2016-01-04 
end date 2020-05-12 


3m/3m performance (turb lvl == 140) original original w vix, no turb 
average annual return in 30 runs 10.34% 
max drawdown in 30 runs -11.06% 


average Calmar (=average return/max drawdown) ratio over 30 runs 1.44 


median Sharpe ratio over 30 runs 1.27 


Here is the result produced by 30 samples of the ensemble without a turbulence index but VIX data instead: 


We can conclude that the newly introduced ensemble’s risk-adjusted returns are equivalent to the original
ensemble’s.¹² Additionally, the problem of finding the optimal turbulence level and then setting this level as a constant
has been avoided.


4.1.4. Ensemble Versus Agents 


As indicated in Table 1, the performance of the ensemble is not better than that of its agents. In fact, it is worthwhile
analyzing whether the DDPG agent outperforms the ensemble significantly. Although the DDPG agent outperformed the ensemble on all
four performance criteria, the result of a t-test¹³ on the produced Sharpe ratios shows that the risk-adjusted
outperformance of the DDPG agent is not significant.


4.1.4.1. Robustness of DDPG Results 


Yang et al. (2020) claim to make the trading strategy more robust and reliable by deploying the ensemble:
“Annualized volatility and max drawdown measure the robustness of a model.” (Yang et al., 2020)


The claim of increasing the strategy’s robustness and reliability cannot be maintained. Even though the annualized
volatility of the returns is significantly lower for the ensemble, this is not the case for the drawdowns.¹⁴


4.2. Summary of Findings


A single DDPG agent produces risk-adjusted returns as high as those of the original ensemble.
Its results are as robust and reliable as the ensemble’s.


Additionally, every three months the ensemble uses a validation period to pick the best-performing agent to do
the trading for the next quarter. Given that an ensemble that picks its trader randomly produces equally
good results, we see no value in the validation process.


Finally, we regard the use of a constant turbulence index level as problematic. 


We conclude that an improved model should be built around a DDPG agent only. This is computationally less 
expensive, and it eliminates the decision after the validation period in the original model. To further make use of our 
findings, we are looking at the performance of the DDPG model without a turbulence index but VIX data instead in our 
next section. 


5. A New Model 


Having shown the weaknesses of the ensemble, a new, improved model may be built as a sole 3m/3m DDPG agent with 
VIX data as part of the observation space instead of using a turbulence index level. 


The simulation of 30 runs arrives at the following results in Table 6. 


12 See detailed calculations under Appendix A.4 and Appendix A.4.1


13 See detailed calculations under Appendix A.5 and Appendix A.5.1


14 See detailed calculations under Appendix A.6, Appendix A.6.1, Appendix A.7 and Appendix A.7.1


Table 6: Performance of the Original Ensemble Versus the New Model


start date | 2016-01-04
end date   | 2020-05-12

3m/3m performance (turb lvl == 140) | original-ensemble | new model
average annual return in 30 runs    | 10.34%            | 11.36%
max drawdown in 30 runs             | -11.06%           | -9.41%
average Calmar ratio over 30 runs   | 1.44              | 1.56
median Sharpe ratio over 30 runs    | 1.27              | 1.39


The results look very promising as the new model beats the original ensemble model in all categories. Although
better, the risk-adjusted outperformance of the new model is not significant.¹⁵ Nevertheless, we think the newly
introduced model is preferable to the original ensemble (Yang et al., 2020):


• It produces equally good risk-adjusted returns.
• It is equally robust¹⁶.
• It does not require a constant turbulence index threshold.
• It does not require a validation period criterion.
• It is less computationally expensive.
• It is simpler.


6. Conclusions and Ideas for Future Work 


It is possible to improve the original ensemble model by focusing on its best performing agent and introducing VIX 
data instead of a turbulence index. 


There are plenty of areas for future work to improve the performance of new models that are either based on the 
original ensemble or the newly introduced model. 


We see opportunities in a revised ensemble algorithm that does not use its agents exclusively during the trading
period but weights their actions according to their validation period performance. It is also worthwhile to consider additional
or different agents as part of the ensemble. In addition to the areas of future research suggested by Yang et al. (2020),
we also see potential in expanding the action space to allow short selling of stocks, and in introducing
money management rules (Kestner, 2003).
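

One possible reading of the weighting idea is sketched below. It is purely hypothetical and not part of the original or the new model: weights are taken proportional to each agent's validation Sharpe ratio, negative ratios are clipped to zero, and the agents' proposed share counts are blended linearly.

```python
def validation_weights(sharpe_by_agent):
    """Weights proportional to validation Sharpe ratios, with negative ratios clipped to zero."""
    clipped = {name: max(value, 0.0) for name, value in sharpe_by_agent.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # no agent had a positive Sharpe ratio: fall back to equal weights
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: value / total for name, value in clipped.items()}


def blend_actions(actions_by_agent, weights):
    """Weighted sum of the agents' proposed trades, one entry per stock."""
    n_assets = len(next(iter(actions_by_agent.values())))
    return [
        sum(weights[name] * actions[i] for name, actions in actions_by_agent.items())
        for i in range(n_assets)
    ]


# Hypothetical validation Sharpe ratios and proposed trades (shares per stock).
weights = validation_weights({"A2C": 1.0, "PPO": 3.0, "DDPG": -0.5})
blended = blend_actions({"A2C": [4, 0], "PPO": [0, -4], "DDPG": [10, 10]}, weights)
```

Clipping at zero is only one design choice; a softmax over Sharpe ratios or a rank-based scheme would be equally plausible starting points.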


As the ensemble is computationally expensive, there may be more scope for additional variables in the newly
introduced model. In particular, trading volume is often used alongside technical indicators. It may also be worth
amending the model itself by introducing parameter noise.


Probably the most challenging but also most promising area of research may be the introduction of new, deeper 
neural networks (February 4, 2021). 


Acknowledgment: This work was done under the supervision of Prof. Bernhard Pfahringer (bernhard@waikato.ac.nz).
His teachings in Data Mining, Machine Learning and Deep Learning helped me build a solid foundation to further 


Appendix A: Statistical Tests


A.1. D'Agostino and Pearson Test for Normal Distribution of the Sharpe Ratios Produced by the Ensemble


In [110]: # test for normality
          import scipy.stats as stats
          from numpy import genfromtxt

          x = genfromtxt('data/book3.csv', delimiter=',')
          SR_of_3m3m_ensemble = x
          SR_of_3m3m_ensemble
Out[110]: array([1.4478, 1.3059, 1.4469, 1.4749, 1.4185, 1.3039,
                 1.4861, 1.465 , 1.3416, 1.084 , 1.2838, 1.2358,
                 1.1768, 1.2603, 1.03  , 1.2565, 0.7485, 1.5445,
                 1.0038, 1.3953, 1.2194, 1.1495])


In [106]: stats.normaltest(x)


Out[106]: NormaltestResult(statistic=0.49447524345481364, pvalue=0.7809550995818826)


A.1.1. z-Value of Buy and Hold Strategy


In [115]: # Sharpe ratios:
          x


Out[115]: array([1.4478, 1.4749, 1.4185, 1.3039,
                 1.4861, 1.084 , 1.2838, 1.2358,
                 1.1768, 1.2565, 0.7485, 1.5445,
                 1.0038, 1.1495])


In [112]: # standard error
          # mu
          # observation

          se = stats.sem(x, axis=None, ddof=0)
          se

          mu = x.mean()

          # observed SR of buy & hold
          observation = 0.3886


In [114]: z_value(mu, observation, se)


Out[114]: 22.45


A.2. D'Agostino and Pearson Test for Normal Distribution of the Sharpe Ratios Produced by the Random-Choice Ensemble


In [797]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_random_choice_ensemble = x
          SRs_of_random_choice_ensemble
Out[797]: array([1.4337, 1.3327, 1.16  , 1.5606, 1.6891, 1.2629, 1.3346,
                 1.2473, 1.0366, 1.5943, 1.178 , 1.404 , 1.2926, 1.7552,
                 1.0486, 1.337 , 1.4085, 1.8399, 1.2875, 0.9267, 1.197 ,
                 1.2949, 1.5583, 0.757 , 1.2925, 0.5695])


In [798]: stats.normaltest(x)


Out[798]: NormaltestResult(statistic=1.8026670592751386, pvalue=0.4060278483906372)


A.2.1. t-Test of Sharpe Ratios of Original Ensemble Versus a Random-Choice Ensemble


In [198]: x = genfromtxt('data/book1.csv', delimiter=',')


In [199]: x


Out[199]: array([1.4478, 1.4749, 1.4185, 1.3039,
                 1.4861, 1.084 , 1.2838, 1.2358,
                 1.1768, 1.2565, 0.7485, 1.5445,
                 1.0038, 1.1495])


In [200]: x.mean()


Out[200]: 1.2631133333333333


In [201]: y = genfromtxt('data/book2.csv', delimiter=',')


In [202]: y


Out[202]: array([1.4337, 1.6891, 1.2629, 1.3346,
                 1.2473, 1.404 , 1.2926, 1.7552,
                 1.0486, 1.2875, 0.9267, 1.197 ,
                 1.2949, 0.5695])


In [203]: y.mean()


Out[203]: 1.2978399999999999


Specifics:

original = x
randomChoice = y


In [216]: from scipy.stats import ttest_ind

          def compare_2_groups(arr_1, arr_2, alpha, sample_size):
              # two-sample t-test; sample_size is passed for reference only
              stat, p = ttest_ind(arr_1, arr_2)
              print("statistics=%.3f,p=%.3f" % (stat, p))
              if p > alpha:
                  print("same distribution (fail to reject H0)")
              else:
                  print("different distribution (reject H0)")


In [211]: sample_size = 30
          alpha = 0.05
          compare_2_groups(x, y, alpha, sample_size)


statistics=-0.529,p=0.599
same distribution (fail to reject H0)


A.3. D'Agostino and Pearson Test for Normal Distribution of the Sharpe Ratios Produced by the Threshold Levels 120 and 140


By splitting the period from January 5, 2010 to January 6, 2016, and running performance backtests for each
quarter we get the following Sharpe ratios:


In [897]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_threshold_120 = x
          SRs_of_threshold_120


Out[897]: array([ 1.2536, -1.7108,  2.9851,  3.9061,  2.3434,  0.4598, -1.5971,
                  3.5187,  3.401 , -0.2444,  1.6468, -0.1076,  4.9298,  1.4204,
                 -0.4612,  3.1684, -1.4593,  2.0454,  0.2135,  0.6341,  0.7719,
                 -0.7704,  0.0755, -0.7144])


In [898]: stats.normaltest(x)


Out[898]: NormaltestResult(statistic=1.7393329574542284, pvalue=0.41909130180555165)


In [899]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_threshold_140 = x
          SRs_of_threshold_140


Out[899]: array([ 1.2536, -1.7108,  2.9851,  3.9061,  2.3434,  0.4598, -1.5971,
                  3.5187,  3.401 , -0.5115,  1.6468, -0.1076,  4.9298,  1.4204,
                 -0.4612,  3.7576, -0.3659,  2.0454,  0.2135,  0.8552,  0.5331,
                 -0.7704, -0.7523, -0.084 ])


In [900]: stats.normaltest(x)


Out[900]: NormaltestResult(statistic=2.1384178849929008, pvalue=0.34327996423105567)


A.3.1. t-Test of Sharpe Ratios with Turbulence Threshold at 120 and 140 to Determine the Best Turbulence Threshold Level


In [2]: _120lvl = genfromtxt('data/book1.csv', delimiter=',')


In [3]: _120lvl


Out[3]: array([ 1.2536, -1.7108,  2.9851,  3.9061,  2.3434, -1.5971,
                3.5187,  3.401 , -0.2444,  1.6468, -0.1076,  1.4204,
               -0.4612,  3.1684, -1.4593,  2.0454,  0.2135,  0.7719,
               -0.7704,  0.0755, -0.7144])


In [4]: _120lvl.mean()


Out[4]: 1.0711791666666668


In [5]: _140lvl = genfromtxt('data/book2.csv', delimiter=',')


In [6]: _140lvl


Out[6]: array([ 1.2536,  2.9851,  3.9061,  2.3434, -1.5971,
                3.5187, -0.5115,  1.6468, -0.1076,  1.4204,
               -0.4612, -0.3659,  2.0454,  0.2135,  0.5331,
               -0.7704, -0.084 ])


In [7]: _140lvl.mean()


Out[7]: 1.1211958333333334


In [14]: def compare_2_groups(arr_1, arr_2, alpha, sample_size):
             stat, p = ttest_ind(arr_1, arr_2)
             print("statistics=%.3f,p=%.3f" % (stat, p))
             if p > alpha:
                 print("same distribution (fail to reject H0)")
             else:
                 print("different distribution (reject H0)")

         sample_size = 24
         alpha = 0.05
         compare_2_groups(_120lvl, _140lvl, alpha, sample_size)


statistics=-0.092,p=0.927
same distribution (fail to reject H0)


A.3.2. z-Value of Buy and Hold Strategy with Turbulence Index


In [254]: # Sharpe ratios of original ensemble:
          x


Out[254]: array([1.4478, 1.3059, 1.5523, 1.0585, 1.4469, 1.4749, 1.4185, 1.3039,
                 1.4861, 1.465 , 1.6915, 1.0201, 1.3416, 1.084 , 1.2838, 1.2358,
                 1.1768, 1.2603, 0.9204, 1.0445, 1.03  , 1.2565, 0.7485, 1.5445,
                 1.0038, 1.3953, 1.4347, 1.0926, 1.2194, 1.1495])


          # standard error
          # mu
          # observation

          se = stats.sem(x, axis=None, ddof=0)
          se

          mu = x.mean()

          # observed SR of buy & hold with turbulence index at 140
          observation = 0.8765


In [256]: z_value(mu, observation, se)
Out[256]: 9.92


A.4. D'Agostino and Pearson Test for Normal Distribution of the Sharpe Ratios Produced by the Ensemble with
VIX but no Turbulence Index


In [799]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_ensemble_without_turb_but_vix = x
          SRs_of_ensemble_without_turb_but_vix


Out[799]: array([0.8393, 1.3668, 1.1905, 1.4633, 0.9208, 1.2724, 1.0045, 1.0524,
                 1.3915, 1.2613, 1.0923, 0.7267, 0.8514, 1.0706, 1.4177, 1.1137,
                 1.3775, 1.6473, 1.1027, 1.011 , 1.5544, 1.3034, 1.0736, 1.1478,
                 1.9642, 0.9075, 1.2499, 0.8764, 1.572 , 1.4991])


In [800]: stats.normaltest(x)


Out[800]: NormaltestResult(statistic=2.1771747473666867, pvalue=0.33669177759406466)


A.4.1. t-Test of Sharpe Ratios of Original Ensemble Versus an Ensemble Without the Turbulence Index but VIX Data


In [802]: x


Out[802]: array([1.4478, 1.4749, 1.4185, 1.3039,
                 1.4861, 1.084 , 1.2838, 1.2358,
                 1.1768, 1.2565, 0.7485, 1.5445,
                 1.0038, 1.1495])


In [803]: x.mean()


Out[803]: 1.2631133333333333


In [804]: y = genfromtxt('data/book2.csv', delimiter=',')


In [805]: y


Out[805]: array([0.8393, 1.2724, 1.0045, 1.0524,
                 1.3915, 1.0706, 1.4177, 1.1137,
                 1.3775, 1.3034, 1.0736, 1.1478,
                 1.9642, 1.4991])


In [806]: y.mean()


Out[806]: 1.210733333333333


In [809]: ensemble_SRs = x
          ensemble_vix_no_turb_SRs = y

          sample_size = 30
          alpha = 0.05
          compare_2_groups(x, y, alpha, sample_size)


statistics=0.810, p=0.421
same distribution (fail to reject H0)


A.5. D'Agostino and Pearson Test for Normal Distribution of the Sharpe Ratios Produced by the DDPG Agent


In [812]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_DDPG = x
          SRs_of_DDPG
Out[812]: array([1.0392, 1.2217, 1.1476, 1.2482, 1.1175, 0.9956,
                 1.3375, 1.3897, 1.3492, 1.9681, 1.2713, 1.5596,
                 1.0746, 1.3173, 1.5567, 1.4105, 0.8659, 1.1368,
                 1.1896, 1.6178, 1.4549, 1.4961])


In [813]: stats.normaltest(x)


Out[813]: NormaltestResult(statistic=1.3780405173228611, pvalue=0.5020677246836951)


A.5.1. t-Test of Original Ensemble’s Sharpe Ratios Versus DDPG’s Sharpe Ratios


In [815]: x
Out[815]: array([1.4478, 1.4749, 1.4185,
                 1.4861, 1.084 , 1.2838,
                 1.1768, 1.2565, 0.7485,
                 1.0038, 1.1495])


In [816]: x.mean()


Out[816]: 1.2631133333333333
In [817]: y=genfromtxt('data/book2.


In [818]: y


Out[818]: array([1.0392, 1.1175,
                 1.3375, 1.2713,
                 1.0746, 0.8659,
                 1.1896,


In [819]: y.mean()


Out[819]: 1.32263


In [823]: ensemble_SRs = x
          DDPG_SRs = y

          sample_size = 30
          alpha = 0.05
          compare_2_groups(x, y, alpha, sample_size)


statistics=-0.892,p=0.376
same distribution (fail to reject H0)


A.6. D'Agostino and Pearson Test for Normal Distribution of Standard Deviations of Returns of Ensemble and
DDPG Agent


          # test for normality
          import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SDs_of_3m3m_ensemble = x
          SDs_of_3m3m_ensemble
Out[508]: array([0.0761, 0.0784, 0.0771, 0.0892, 0.0915,
                 0.0796, 0.0757, 0.0816, 0.0815, 0.0839,
                 0.0793, 0.0824, 0.0843, 0.0853, 0.0842,
                 0.0882, 0.0753, 0.084 ])


In [509]: stats.normaltest(x)


Out[509]: NormaltestResult(statistic=1.643188987926699, pvalue=0.4397299484733067)


          # test for normality
          import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SDs_of_3m3m_DDPG = x
          SDs_of_3m3m_DDPG


Out[539]: array([0.0928, 0.0849, 0.081 , 0.0866, 0.0836,
                 0.0826, 0.0759, 0.0814, 0.089 , 0.0911,
                 0.0842, 0.0873, 0.084 , 0.0864, 0.0878,
                 0.0914, 0.0852, 0.0841])


In [529]: stats.normaltest(x)


Out[529]: NormaltestResult(statistic=0.17667206970169372, pvalue=0.9154532008515076)


A.6.1. t-Test of Original Ensemble’s Volatility (=Standard Deviation) of Returns Versus DDPG’s Volatility of
Returns


In [711]: original_vola = x
          DDPG_vola = y

          sample_size = 30
          alpha = 0.05
          compare_2_groups(x, y, alpha, sample_size)


statistics=-2.796, p=0.007
different distribution (reject H0)


A.7. D'Agostino and Pearson Test for Normal Distribution of Max Drawdowns of Ensemble and DDPG Agent


In [722]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          DDs_of_3m3m_ensemble = x
          DDs_of_3m3m_ensemble


Out[722]: array([-0.0805, -0.0775, -0.0681, -0.0509,
                 -0.0987, -0.0624, -0.0906, -0.0957,
                 -0.0638, -0.0677, -0.0833, -0.0678,
                 -0.09  , -0.0805, -0.0747, -0.0756,
                 -0.053 , -0.0596])


In [723]: stats.normaltest(x)


Out[723]: NormaltestResult(statistic=1.2980572104994073, pvalue=0.5225531356709714)


In [831]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          DDs_of_DDPG = x
          DDs_of_DDPG


Out[831]: array([-0.0847, -0.0803,
                 -0.0729, -0.059 ,
                 -0.062 , -0.0773,
                 -0.0623, -0.0748,
                 -0.0631, -0.0598])


In [832]: stats.normaltest(x)


Out[832]: NormaltestResult(statistic=0.7182052148353113, pvalue=0.6983026966390398)

A.7.1. t-Test of Max Drawdowns of Original Ensemble Versus DDPG Agent 


In [842]: ensemble_DDs=x
          DDPG_DDs=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=-0.609, p=0.545
same distribution (fail to reject H0)

In [394]: x = genfromtxt('data/book1.csv', delimiter=',')

In [395]: x

Out[395]: array([-0.0805, -0.0775, -0.0681, ...,
                 -0.0987, -0.0624, -0.0906, ...,
                 -0.0638, -0.0677, -0.0833, ...,
                 -0.09  , -0.0805, -0.0747, ...,
                 -0.053 , -0.0596])

In [396]: x.mean()

Out[396]: -0.07458333333333332

In [397]: y = genfromtxt('data/book2.csv', delimiter=',')

In [398]: y

Out[398]: array([-0.0847, -0.0803, ...,
                 -0.0729, -0.059 , ...,
                 -0.062 , -0.0773, ...,
                 -0.0623, -0.0748, ...,
                 -0.0631, -0.0598])

In [399]: y.mean()

Out[399]: -0.07242333333333333

Specifics:

original=x
DDPG=y

sample_size=30
alpha=0.05
compare_2_groups(x,y, alpha, sample_size)

statistics=-0.609, p=0.545
same distribution (fail to reject H0)
A.8. D'Agostino and Pearson Test for Normal Distribution of Sharpe Ratios of the New Model 


In [844]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_new_model = x
          SRs_of_new_model

Out[844]: array([1.1097, 1.1193, 1.3093, 1.4602, 1.5594, 1.7784, 1.4893,
                 1.3918, 1.4649, 0.9673, 1.0103, 1.8763, 1.8193, 1.4146,
                 1.4063, 1.0164, 1.3537, 0.9179, 1.0814, 1.4019, 1.3976,
                 1.337 , 1.0302, 1.175 , 0.9504, 1.5153])

In [845]: stats.normaltest(x)

Out[845]: NormaltestResult(statistic=0.4940739432758695, pvalue=0.7811118140143439)


A.8.1. t-Test of Original Ensemble’s Sharpe Ratios Versus the New Model’s Sharpe Ratios 


In [752]: x = genfromtxt('data/book1.csv', delimiter=',')

In [753]: x

Out[753]: array([1.4478, ..., 1.4185, 1.3039,
                 1.4861, ..., 1.2838, 1.2358,
                 1.1768, ..., 0.7485, 1.5445,
                 1.0038, ...])

In [754]: x.mean()

Out[754]: 1.2631133333333333

In [755]: y = genfromtxt('data/book2.csv', delimiter=',')

In [756]: y

Out[756]: array([1.1097, ..., 1.5594, 1.7784, 1.4863,
                 1.3918, ..., 1.8763, 1.8193, 1.4146,
                 1.4063, ..., 1.4019, 1.3976,
                 1.337 , ..., 1.5153])

In [757]: y.mean()

Out[757]: 1.3413566666666668

In [760]: original_SR=x
          new_model_SR=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=-1.261, p=0.212
same distribution (fail to reject H0)
A.9. D'Agostino and Pearson Test for Normal Distribution of Standard Deviations of the New Model 


In [761]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SDs_of_newModel = x
          SDs_of_newModel

Out[761]: array([0.084 , 0.0882, 0.089 , 0.0835, 0.082 , 0.082 , 0.0781,
                 0.0845, 0.0814, 0.0869, 0.0832, 0.0825, 0.0891, 0.0815,
                 0.0844, 0.0844, 0.0928, 0.0849, 0.089 , 0.0908, 0.0868,
                 0.0839, 0.0842, 0.0827, 0.0838, 0.0774])

In [762]: stats.normaltest(x)

Out[762]: NormaltestResult(statistic=0.6862102862779538, pvalue=0.7095636018935232)


A.9.1. t-Test of Original Ensemble’s Volatility (=Standard Deviation) of Returns Versus the New Model’s Volatility 
of Returns 


In [763]: x = genfromtxt('data/book1.csv', delimiter=',')

In [764]: x

Out[764]: array([..., 0.0759, 0.0808, 0.0745,
                 ..., 0.0816, 0.0733, 0.0875,
                 ..., 0.08  , 0.0838, 0.0762,
                 ..., 0.0805])

In [765]: x.mean()

Out[765]: 0.08222999999999998

In [766]: y = genfromtxt('data/book2.csv', delimiter=',')

In [767]: y

Out[767]: array([0.084 , 0.0882, ..., 0.082 , 0.082 , 0.0781,
                 0.0845, 0.0814, ..., 0.0825, 0.0891, 0.0815,
                 0.0844, 0.0844, ..., 0.089 , 0.0908, 0.0868,
                 0.0839, 0.0842, ..., 0.0774])

In [768]: y.mean()

Out[768]: 0.08478000000000001

In [771]: original_SR=x
          new_model_SR=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=-2.005, p=0.050
different distribution (reject H0)
A.10. D'Agostino and Pearson Test for Normal Distribution of Max Drawdowns of the New Model 


In [772]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          DDs_of_newModel = x
          DDs_of_newModel

Out[772]: array([-0.0938, -0.0705, -0.0677, ...,
                 -0.0814, -0.087 , -0.073 , ...,
                 -0.0794, -0.0708, -0.0766, ...,
                 -0.0941, -0.079 , -0.0684, ...,
                 -0.0595, -0.0579])

In [773]: stats.normaltest(x)

Out[773]: NormaltestResult(statistic=0.21860976900535734, pvalue=0.896457059964215)


A.10.1. t-Test of Max Drawdowns of Original Ensemble Versus the New Model’s Max Drawdowns 


In [775]: x

Out[775]: array([-0.0805, -0.0775, -0.0681, ...,
                 -0.0987, -0.0624, -0.0906, ...,
                 -0.0638, -0.0677, -0.0833, ...,
                 -0.09  , -0.0805, -0.0747, ...,
                 -0.053 , -0.0596])

In [776]: x.mean()

Out[776]: -0.07458333333333332

In [777]: y = genfromtxt('data/book2.csv', delimiter=',')

In [778]: y

Out[778]: array([-0.0938, -0.0705, -0.0677, ...,
                 -0.0814, -0.087 , -0.073 , ...,
                 -0.0794, -0.0708, -0.0766, ...,
                 -0.0941, -0.079 , -0.0684, ...,
                 -0.0595, -0.0579])

In [779]: y.mean()

Out[779]: -0.07475666666666667

In [783]: original_DDs=x
          new_model_DDs=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=0.053, p=0.958
same distribution (fail to reject H0)
Appendix B: Further Experiments 


B.1. DDPG Agent Without Any Technical Indicators, Nor VIX Data, Nor Turbulence Index, i.e., Only Price, 
Current Portfolio, and Money Balance in Observation Space 


Given that all the technical indicators used are derived from the price, we found it worthwhile to test the performance 
of a slimmed-down DDPG agent. The results of 30 runs, compared with the original ensemble, are as follows: 


Table 7: Performance of the original ensemble versus a slimmed-down DDPG agent 

start date: 2016-01-04 
end date:   2020-05-12 

3m/3m performance (turb lvl == 140)                                   original    DDPG price only
average annual return in 30 runs                                      10.34%      10.56%
max drawdown in 30 runs                                               -11.06%     -8.49%
average Calmar (= average return / max drawdown) ratio over 30 runs   1.44        1.59
median Sharpe ratio over 30 runs                                      1.27        1.23
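The code used to compute the metrics in these tables is not shown in this appendix. As a rough sketch of what the row labels refer to, the per-run quantities could be computed from a series of daily returns as below. The annualization factor of 252 trading days, a zero risk-free rate, and the synthetic return series are assumptions made for illustration.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor


def annual_return(daily_returns):
    """Compound annual growth rate implied by a series of daily returns."""
    total_growth = np.prod(1.0 + daily_returns)
    return total_growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the wealth curve (a negative number)."""
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + daily_returns)))
    running_peak = np.maximum.accumulate(wealth)
    return np.min(wealth / running_peak - 1.0)


def sharpe_ratio(daily_returns):
    """Annualized Sharpe ratio with an assumed risk-free rate of zero."""
    return np.sqrt(TRADING_DAYS) * daily_returns.mean() / daily_returns.std(ddof=1)


def calmar_ratio(daily_returns):
    """Annual return divided by the absolute value of the max drawdown."""
    return annual_return(daily_returns) / abs(max_drawdown(daily_returns))


# Synthetic daily returns for one run (NOT the paper's data).
rng = np.random.default_rng(2)
returns = rng.normal(loc=0.0004, scale=0.008, size=4 * TRADING_DAYS)
print(f"annual return: {annual_return(returns):.2%}")
print(f"max drawdown:  {max_drawdown(returns):.2%}")
print(f"Sharpe ratio:  {sharpe_ratio(returns):.2f}")
print(f"Calmar ratio:  {calmar_ratio(returns):.2f}")
```

The tables aggregate such per-run values over 30 runs (average, maximum, or median, as stated in each row label).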


The price-only DDPG agent's risk-adjusted performance is no worse than the original ensemble's: 


In [862]: x

Out[862]: array([1.4478, ..., 1.4185, 1.3039,
                 1.4861, ..., 1.2838, 1.2358,
                 1.1768, ..., 0.7485, 1.5445,
                 1.0038, ...])

In [863]: x.mean()

Out[863]: 1.2631133333333333

In [864]: y = genfromtxt('data/book2.csv', delimiter=',')

In [865]: y

Out[865]: array([1.4404, ..., 1.2557, 1.1943,
                 1.4544, ..., 1.4504, 1.4231,
                 1.071 , ..., 0.9508, 1.2717,
                 1.1456, ...])

In [866]: y.mean()

Out[866]: 1.2498633333333335

In [869]: ensemble_SRs=x
          DDPG_price_only_SRs=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=0.257, p=0.798
same distribution (fail to reject H0)
B.2. DDPG Agent with Different Noise Levels 


The original ensemble uses 0.5 for the variability (sigma) of the DDPG agent's action noise [8]: 

action_noise = OrnsteinUhlenbeckActionNoise(mean=np.zeros(n_actions), sigma=float(0.5) * np.ones(n_actions)) 
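To illustrate what the sigma parameter controls, the following is a minimal NumPy sketch of mean-zero Ornstein-Uhlenbeck noise. It is not the library class used by the ensemble; the function name `ou_noise_path` and the values chosen for `theta`, `dt`, `n_steps`, and `n_actions` are assumptions made for illustration.

```python
import numpy as np


def ou_noise_path(sigma, n_steps=1000, n_actions=30, theta=0.15, dt=0.01, seed=0):
    """Simulate mean-zero Ornstein-Uhlenbeck noise with an Euler-Maruyama step."""
    rng = np.random.default_rng(seed)
    noise = np.zeros(n_actions)
    path = np.empty((n_steps, n_actions))
    for t in range(n_steps):
        # Pull the noise back towards zero, then add a Gaussian shock scaled by sigma.
        shock = sigma * np.sqrt(dt) * rng.standard_normal(n_actions)
        noise = noise - theta * noise * dt + shock
        path[t] = noise
    return path


# Spread of the simulated noise for the three sigma levels tested in this section.
spread = {sigma: ou_noise_path(sigma).std() for sigma in (0.01, 0.5, 2.0)}
for sigma, std in spread.items():
    print(f"sigma={sigma}: std of noise = {std:.4f}")
```

Because the recursion is linear in sigma, the magnitude of the exploration noise added to the agent's actions scales proportionally with it.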


Running 30 simulations of the DDPG agent with different levels of this noise produced the following results: 


Table 8: Performance of the Original DDPG Agent Versus a DDPG with Lower and Higher Action Noise 

start date: 2016-01-04 
end date:   2020-05-12 

3m/3m performance (turb lvl == 140)                                   DDPG (sigma=0.01)   DDPG (sigma=0.5)   DDPG (sigma=2.0)
average annual return in 30 runs                                      11.17%              11.33%             10.95%
max drawdown in 30 runs                                               -12.74%             -9.96%             -12.19%
average Calmar (= average return / max drawdown) ratio over 30 runs   1.54                1.63               1.54
median Sharpe ratio over 30 runs                                      1.32                1.32               1.29


B.3. DDPG Agent with Alternative Reward Function 


In order to reduce transaction costs further, we tried to discourage the agent from trading excessively by amending 
the reward function. 


Original reward function: self.reward = end_total_asset - begin_total_asset 


Alternative reward function: self.reward = end_total_asset - begin_total_asset - (self.cost ** 1.4) 


A simulation of 30 runs produced the following results: 


Table 9: Performance of the Original DDPG Agent Versus a DDPG Agent with an Alternative Reward Function 

start date: 2016-01-04 
end date:   2020-05-12 

3m/3m performance (turb lvl == 140)                                   original DDPG   DDPG with alternative reward function
average annual return in 30 runs                                      11.33%          11.76%
max drawdown in 30 runs                                               -9.96%          -9.91%
average Calmar (= average return / max drawdown) ratio over 30 runs   1.63            1.64
median Sharpe ratio over 30 runs                                      1.32            1.38
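The two reward functions of B.3 can be written as plain functions to show how the penalty term behaves. This is a sketch only: the argument `cost` stands in for `self.cost`, whose accumulation inside the trading environment is not shown in this appendix, and the sample cost values are made up.

```python
def original_reward(begin_total_asset, end_total_asset):
    """Change in total asset value over one step."""
    return end_total_asset - begin_total_asset


def alternative_reward(begin_total_asset, end_total_asset, cost, exponent=1.4):
    """Change in total asset value minus a superlinear transaction-cost penalty.

    Assumes cost >= 0; a negative cost would not have a real-valued power.
    """
    return end_total_asset - begin_total_asset - cost ** exponent


# The penalty is smaller than the cost itself below 1 and grows faster above 1.
for cost in (0.5, 1.0, 10.0, 100.0):
    penalty = cost ** 1.4
    print(f"cost={cost:>6}: penalty={penalty:.2f}")
```

With an exponent above 1, large transaction costs are penalized disproportionately, which is the intended discouragement of excessive trading.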


Even though the DDPG agent with the alternative reward function produced higher Sharpe ratios, the difference 
between the original DDPG agent's Sharpe ratios and this agent's Sharpe ratios is not significant: 
In [886]: # test for normality
          # import scipy.stats as stats

          x = genfromtxt('data/book3.csv', delimiter=',')
          SRs_of_DDPG_with_alternative_reward_function = x
          SRs_of_DDPG_with_alternative_reward_function

Out[886]: array([1.2593, 1.0673, 1.0608, 1.3885, 1.732 , 1.0888, 1.4992, 1.2907,
                 1.2123, 1.3749, 1.5285, 1.4896, 1.3451, 1.0592, 1.5285, 1.6508,
                 1.3073, 1.3317, 1.5628, 1.7402, 0.9809, 1.1237, 1.3979, 1.4064,
                 1.4107, 1.1749, 1.611 , 1.396 , 1.7209, 1.286 ])

In [887]: stats.normaltest(x)

Out[887]: NormaltestResult(statistic=1.1575529420367996, pvalue=0.5605838377050343)


In [877]: x

Out[877]: array([1.0392, 1.4445, 1.2482, 1.1175, 0.9956,
                 1.3375, 1.3153, 1.9681, 1.2713, 1.5596,
                 1.0746, 1.8996, 1.4105, 0.8659, 1.1368,
                 1.1896, 1.2506, 1.4961])

In [878]: x.mean()

Out[878]: 1.32263

In [879]: y = genfromtxt('data/book2.csv', delimiter=',')

In [880]: y

Out[880]: array([1.2593, 1.0673, 1.0608, 1.3885, 1.732 , 1.0888, 1.4992, 1.2907,
                 1.2123, 1.3749, 1.5285, 1.4896, 1.3451, 1.0592, 1.5285, 1.6508,
                 1.3073, 1.3317, 1.5628, 1.7402, 0.9809, 1.1237, 1.3979, 1.4064,
                 1.4107, 1.1749, 1.611 , 1.396 , 1.7209, 1.286 ])

In [881]: y.mean()

Out[881]: 1.36753
In [885]: original_DDPG_SRs=x
          DDPG_with_alternative_reward_function_SRs=y

          sample_size=30
          alpha=0.05
          compare_2_groups(x,y, alpha, sample_size)

statistics=-0.677, p=0.501
same distribution (fail to reject H0)


B.4. Ensemble with Highest Return Instead of Highest Risk-Adjusted Return as Selection Criteria After 
Validation Period 


As the random-choice ensemble of 4.1.4 did no worse than the original ensemble, we explored an alternative 
agent selection process after the validation period based on the highest absolute return produced during the 
validation period instead of the highest risk-adjusted return: 


Table 10: Performance of the Original Ensemble Versus the Ensemble Based on Absolute Returns During the 
Validation Period 

start date: 2016-01-04 
end date:   2020-05-12 

3m/3m performance (turb lvl == 140)                                   original    ensemble based on absolute returns
average annual return in 30 runs                                      10.34%      10.02%
max drawdown in 30 runs                                               -11.06%     -11.88%
average Calmar (= average return / max drawdown) ratio over 30 runs   1.44        1.35
median Sharpe ratio over 30 runs                                      1.27        1.17
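The difference between the two selection rules compared in B.4 can be sketched as follows. The function `select_agent`, the dictionary layout, and all validation figures are hypothetical and serve only to show how the choice of criterion can change which agent trades in the next period.

```python
def select_agent(validation_stats, criterion="sharpe"):
    """Return the name of the agent with the highest validation value of `criterion`."""
    return max(validation_stats, key=lambda name: validation_stats[name][criterion])


# Hypothetical results for one validation window (NOT taken from the paper).
validation_stats = {
    "A2C": {"sharpe": 0.9, "return": 0.031},
    "PPO": {"sharpe": 1.2, "return": 0.024},
    "DDPG": {"sharpe": 1.0, "return": 0.040},
}

by_sharpe = select_agent(validation_stats, criterion="sharpe")  # original rule
by_return = select_agent(validation_stats, criterion="return")  # rule tested in B.4
print(f"selected by Sharpe ratio:    {by_sharpe}")
print(f"selected by absolute return: {by_return}")
```

In this made-up window the two criteria pick different agents, which is the situation in which the two ensembles of Table 10 diverge.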
1(3), 27-51. doi: 10.51483/IJDSBDA.1.3.2021.27-51. 


